170 research outputs found
Privacy risk assessment of emerging machine learning paradigms
Machine learning (ML) has progressed tremendously, and data is the key factor to drive such development. However, there are two main challenges regarding collecting the data and handling it with ML models. First, the acquisition of high-quality labeled data can be difficult and expensive due to the need for extensive human annotation. Second, to model the complex relationships between entities, e.g., social networks or molecule structures, graphs have been leveraged. However, conventional ML models may not effectively handle graph data due to the non-linear and complex nature of the relationships between nodes. To address these challenges, recent developments in semi-supervised learning and self-supervised learning have been introduced to leverage unlabeled data for ML tasks. In addition, a new family of ML models known as graph neural networks has been proposed to tackle the challenges associated with graph data. Despite their power, the potential privacy risks stemming from these paradigms should also be taken into account. In this dissertation, we perform a privacy risk assessment of these emerging machine learning paradigms. First, we investigate the membership privacy leakage stemming from semi-supervised learning. Concretely, we propose the first data augmentation-based membership inference attack that is tailored to the training paradigm of semi-supervised learning methods. Second, we quantify the privacy leakage of self-supervised learning through the lens of membership inference attacks and attribute inference attacks. Third, we study the privacy implications of training GNNs on graphs. In particular, we propose the first attack to steal a graph from the outputs of a GNN model that is trained on the graph. Finally, we explore potential defense mechanisms to mitigate these attacks.
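The augmentation-based membership inference idea can be illustrated with a minimal, hypothetical sketch: a training-set member's loss tends to stay low and stable across augmented views of the same sample, so averaging per-augmentation losses and thresholding gives a toy decision rule. The losses, threshold, and decision rule below are illustrative assumptions, not the dissertation's actual attack:

```python
import statistics

def mia_score(aug_losses):
    # Hypothetical per-augmentation losses: members of the training set
    # tend to show low, stable loss across augmented views of a sample,
    # while non-members show higher, more variable loss.
    return statistics.mean(aug_losses)

def predict_member(aug_losses, threshold=0.5):
    # Toy decision rule: flag as a member when the average loss over
    # augmentations falls below a calibrated threshold.
    return mia_score(aug_losses) < threshold
```

A member-like loss profile such as [0.10, 0.12, 0.09] would be flagged, whereas [1.3, 0.9, 1.1] would not; real attacks calibrate the threshold, e.g., with shadow models.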
Stealing Links from Graph Neural Networks
Graph data, such as chemical networks and social networks, may be deemed
confidential/private because the data owner often spends lots of resources
collecting the data or the data contains sensitive information, e.g., social
relationships. Recently, neural networks have been extended to graph data; such
models are known as graph neural networks (GNNs). Due to their superior performance, GNNs
have many applications, such as healthcare analytics, recommender systems, and
fraud detection. In this work, we propose the first attacks to steal a graph
from the outputs of a GNN model that is trained on the graph. Specifically,
given black-box access to a GNN model, our attacks can infer whether there
exists a link between any pair of nodes in the graph used to train the model.
We call our attacks link stealing attacks. We propose a threat model to
systematically characterize an adversary's background knowledge along three
dimensions, which in total leads to a comprehensive taxonomy of 8 different link
stealing attacks. We propose multiple novel methods to realize these 8 attacks.
Extensive experiments on 8 real-world datasets show that our attacks are
effective at stealing links, e.g., AUC (area under the ROC curve) is above 0.95
in multiple cases. Our results indicate that the outputs of a GNN model reveal
rich information about the structure of the graph used to train the model.
Comment: To appear in the 30th Usenix Security Symposium, August 2021, Vancouver, B.C., Canada.
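The intuition behind these attacks can be sketched in a few lines: nodes that are connected in the training graph tend to receive more similar posteriors from the target GNN, so a simple unsupervised attack thresholds a similarity metric over posterior pairs. The metric and threshold below are illustrative choices, not the paper's full method:

```python
import math

def cosine_similarity(p, q):
    # Similarity between two posterior (class-probability) vectors.
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

def predict_link(posterior_u, posterior_v, threshold=0.9):
    # Unsupervised link-stealing intuition: connected nodes tend to
    # receive more similar posteriors from the target GNN.
    return cosine_similarity(posterior_u, posterior_v) >= threshold
```

Two nodes with near-identical posteriors, e.g., [0.9, 0.05, 0.05] and [0.85, 0.1, 0.05], would be predicted as linked; nodes predicted into different classes would not.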
Test-Time Poisoning Attacks Against Test-Time Adaptation Models
Deploying machine learning (ML) models in the wild is challenging as it
suffers from distribution shifts, where the model trained on an original domain
cannot generalize well to unforeseen diverse transfer domains. To address this
challenge, several test-time adaptation (TTA) methods have been proposed to
improve the generalization ability of the target pre-trained models under test
data to cope with the shifted distribution. The success of TTA can be credited
to the continuous fine-tuning of the target model according to the
distributional hint from the test samples during test time. Despite being
powerful, it also opens a new attack surface, i.e., test-time poisoning
attacks, which are substantially different from previous poisoning attacks that
occur during the training time of ML models (i.e., adversaries cannot intervene
in the training process). In this paper, we perform the first test-time
poisoning attack against four mainstream TTA methods, including TTT, DUA, TENT,
and RPL. Concretely, we generate poisoned samples based on the surrogate models
and feed them to the target TTA models. Experimental results show that the TTA
methods are generally vulnerable to test-time poisoning attacks. For instance,
the adversary can feed as few as 10 poisoned samples to degrade the performance
of the target model from 76.20% to 41.83%. Our results demonstrate that TTA
algorithms lacking a rigorous security assessment are unsuitable for deployment
in real-life scenarios. As such, we advocate for the integration of defenses
against test-time poisoning attacks into the design of TTA methods.
Comment: To appear in the 45th IEEE Symposium on Security and Privacy, May 20-23, 2024.
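To see why test-time updates create an attack surface, consider the objective that methods like TENT minimize: the entropy of the model's predictions on incoming test samples. A minimal sketch of that quantity (the adaptation loop and poisoning procedure themselves are omitted):

```python
import math

def prediction_entropy(probs):
    # TENT-style test-time adaptation minimizes the entropy of the
    # model's predictions on incoming test samples (updating, e.g.,
    # normalization parameters). Low entropy = confident prediction.
    return -sum(p * math.log(p) for p in probs if p > 0)
```

A confident prediction like [1.0, 0.0] has entropy 0, while a uniform one attains the maximum log(K). Because this objective is computed on attacker-controllable test samples, crafted inputs can steer the resulting parameter updates, which is the surface the paper exploits.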
MGTBench: Benchmarking Machine-Generated Text Detection
Nowadays large language models (LLMs) have shown revolutionary power in a
variety of natural language processing (NLP) tasks such as text classification,
sentiment analysis, language translation, and question-answering. Consequently,
detecting machine-generated texts (MGTs) is becoming increasingly important as
LLMs become more advanced and prevalent. These models can generate human-like
language that can be difficult to distinguish from text written by a human,
which raises concerns about authenticity, accountability, and potential bias.
However, existing detection methods against MGTs are evaluated under different
model architectures, datasets, and experimental settings, resulting in a lack
of a comprehensive evaluation framework across different methodologies.
In this paper, we fill this gap by proposing the first benchmark framework
for MGT detection, named MGTBench. Extensive evaluations on public datasets
with curated answers generated by ChatGPT (the most representative and powerful
LLM thus far) show that most of the current detection methods perform less
satisfactorily against MGTs. An exceptional case is ChatGPT Detector, which is
trained with ChatGPT-generated texts and shows great performance in detecting
MGTs. Nonetheless, we note that even a small amount of adversarially crafted
perturbation on MGTs can evade the ChatGPT Detector, thus highlighting the
need for more robust MGT detection methods. We envision that MGTBench will
serve as a benchmark tool to accelerate future investigations involving the
evaluation of state-of-the-art MGT detection methods on their respective
datasets and the development of more advanced MGT detection methods. Our source
code and datasets are available at https://github.com/xinleihe/MGTBench
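As a contrast to the model-based detectors that MGTBench evaluates (e.g., log-likelihood or rank statistics), here is a deliberately naive diversity heuristic; it is purely illustrative and is not one of the benchmarked methods:

```python
def distinct_ngram_ratio(text, n=3):
    # Deliberately naive diversity heuristic: heavily repetitive text
    # yields a low ratio of distinct word n-grams. This is NOT one of
    # the detectors benchmarked in MGTBench, which rely on model
    # statistics such as log-likelihood, rank, or entropy.
    tokens = text.lower().split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)
```

A looping, repetitive passage scores far lower than varied prose; real detectors replace this surface statistic with probabilities from a reference language model.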
Generative Watermarking Against Unauthorized Subject-Driven Image Synthesis
Large text-to-image models have shown remarkable performance in synthesizing
high-quality images. In particular, the subject-driven model makes it possible
to personalize the image synthesis for a specific subject, e.g., a human face
or an artistic style, by fine-tuning the generic text-to-image model with a few
images from that subject. Nevertheless, misuse of subject-driven image
synthesis may violate the rights of subject owners. For example, malicious
users may use subject-driven synthesis to mimic specific artistic styles or to
create fake facial images without authorization. To protect subject owners
against such misuse, recent attempts have commonly relied on adversarial
examples to indiscriminately disrupt subject-driven image synthesis. However,
this essentially prevents any benign use of subject-driven synthesis based on
protected images.
In this paper, we take a different angle and aim at protection without
sacrificing the utility of protected images for general synthesis purposes.
Specifically, we propose GenWatermark, a novel watermark system based on
jointly learning a watermark generator and a detector. In particular, to help
the watermark survive the subject-driven synthesis, we incorporate the
synthesis process in learning GenWatermark by fine-tuning the detector with
synthesized images for a specific subject. This operation is shown to largely
improve the watermark detection accuracy and also ensure the uniqueness of the
watermark for each individual subject. Extensive experiments validate the
effectiveness of GenWatermark, especially in practical scenarios with unknown
models and text prompts (74% Acc.), as well as partial data watermarking (80%
Acc. for 1/4 watermarking). We also demonstrate the robustness of GenWatermark
to two potential countermeasures that substantially degrade the synthesis
quality.
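The joint generator/detector learning in GenWatermark is beyond a short snippet, but the underlying embed-and-correlate principle of signal watermarking can be sketched with a fixed pseudorandom pattern. The additive embedding and correlation detector below are simplifying assumptions, not the paper's learned components:

```python
import random

def make_watermark(length, subject_seed):
    # A per-subject pseudorandom +/-1 pattern (a stand-in for the
    # learned, subject-specific watermark in GenWatermark).
    rng = random.Random(subject_seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def embed(signal, watermark, strength=0.05):
    # Additive embedding; the real system uses a learned generator.
    return [s + strength * w for s, w in zip(signal, watermark)]

def detect(signal, watermark):
    # Correlation detector: a clearly positive score suggests the
    # subject's watermark is present in the signal.
    return sum(s * w for s, w in zip(signal, watermark)) / len(watermark)
```

Seeding the pattern per subject mirrors the paper's per-subject uniqueness; the hard part GenWatermark solves is making such a mark survive the fine-tuning and synthesis pipeline, which this toy correlation scheme does not address.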
Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models
State-of-the-art Text-to-Image models like Stable Diffusion and DALLE2
are revolutionizing how people generate visual content. At the same time,
society has serious concerns about how adversaries can exploit such models to
generate unsafe images. In this work, we focus on demystifying the generation
of unsafe images and hateful memes from Text-to-Image models. We first
construct a typology of unsafe images consisting of five categories (sexually
explicit, violent, disturbing, hateful, and political). Then, we assess the
proportion of unsafe images generated by four advanced Text-to-Image models
using four prompt datasets. We find that these models can generate a
substantial percentage of unsafe images; across four models and four prompt
datasets, 14.56% of all generated images are unsafe. When comparing the four
models, we find different risk levels, with Stable Diffusion being the most
prone to generating unsafe content (18.92% of all generated images are unsafe).
Given Stable Diffusion's tendency to generate more unsafe content, we evaluate
its potential to generate hateful meme variants if exploited by an adversary to
attack a specific individual or community. We employ three image editing
methods, DreamBooth, Textual Inversion, and SDEdit, which are supported by
Stable Diffusion. Our evaluation result shows that 24% of the generated images
using DreamBooth are hateful meme variants that present the features of the
original hateful meme and the target individual/community; these generated
images are comparable to hateful meme variants collected from the real world.
Overall, our results demonstrate that the danger of large-scale generation of
unsafe images is imminent. We discuss several mitigating measures, such as
curating training data, regulating prompts, and implementing safety filters,
and encourage better safeguard tools to be developed to prevent unsafe
generation.
Comment: To appear in the ACM Conference on Computer and Communications Security, November 26, 2023.
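Of the mitigations mentioned, prompt regulation is the simplest to sketch. A denylist filter like the following (with a tiny, hypothetical term list) illustrates the idea, though production systems rely on learned safety classifiers rather than keyword matching:

```python
# A tiny, hypothetical denylist; a deployed filter would use a much
# larger vocabulary or, more realistically, a learned safety classifier.
UNSAFE_TERMS = {"gore", "beheading", "nude"}

def prompt_is_safe(prompt):
    # Naive prompt regulation: reject prompts containing denylisted words.
    words = set(prompt.lower().split())
    return words.isdisjoint(UNSAFE_TERMS)
```

Keyword filters are trivially evaded by paraphrasing, which is one reason the paper also discusses curating training data and post-hoc safety filters on generated images.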
Deformation rule of bored pile & steel support for deep foundation pit in sandy pebble geology
Taking the whole excavation process of the support system at the Southwest Jiaotong University Station of Chengdu Metro Line 6 (a deep foundation pit supported by bored piles and steel struts) as the engineering background, this paper studies the deformation behavior of the bored-pile-and-steel-strut support system in sandy pebble ground. The deformation of the support system, the settlement of the ground surface outside the pit, and the heave of the soil at the bottom of the pit are analyzed. A key analysis of the convex corner of the foundation pit is conducted, and the rationality of the optimized support scheme is evaluated. The findings provide effective guidance and a reference for subsequent deep foundation pit construction.